# ONNX Optimization

Parakeet Tdt 0.6b V2 Onnx

NVIDIA Parakeet TDT 0.6B V2 is an automatic speech recognition (ASR) model for English speech-to-text.

Speech Recognition English

Dinov2 Base ONNX

This is the ONNX format version of the facebook/dinov2-base model, suitable for computer vision tasks.

Xlm Roberta Base Language Detection Tfjs

This is a language detection model based on XLM-RoBERTa that recognizes 20 languages.

Text Classification Supports Multiple Languages

Moonshine Tiny ONNX

Moonshine Tiny is a lightweight automatic speech recognition (ASR) model suitable for embedded devices and edge computing scenarios.

Speech Recognition

Prompt Injection Defender Large V0 Onnx

TestSavantAI models are a set of fine-tuned classifiers specifically designed to defend against prompt injection and jailbreak attacks targeting large language models (LLMs).

Text Classification

Transformers English

Prompt Injection Defender Large V0

TestSavantAI models are a set of classifiers specifically designed to defend against prompt injection and jailbreak attacks on large language models (LLMs). The tiny version is based on the BERT-tiny architecture, balancing security and computational efficiency.

Text Classification

Transformers English

Granite Timeseries Patchtsmixer

A time series forecasting model based on the PatchTSMixer architecture, developed by IBM, suitable for multivariate time series forecasting tasks.

Microsoft Speecht5 Tts ONNX

This is the ONNX format conversion of Microsoft's SpeechT5 text-to-speech (TTS) model, optimized for Transformers.js

Speech Synthesis

Transformers English

Whisper Large V3 Turbo

An ONNX-optimized Whisper large speech recognition model designed for web deployment

Speech Recognition

This is an ONNX-optimized version of the facebook/bart-large-cnn model, primarily used for text summarization tasks.

Text Generation

Timesformer Hr Finetuned K400

TimeSformer-HR is a high-resolution spatiotemporal Transformer model for video, fine-tuned on the Kinetics-400 dataset, suitable for video action recognition tasks.

Video Processing

Bge Reranker V2 M3 Onnx O4

The ONNX O4 version of BGE-RERANKER-V2 is an optimized text reranking model that scores the relevance of multilingual text pairs.
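
As a rough illustration of how relevance scores for text pairs are typically used, the sketch below maps raw scores to the range 0 to 1 with a sigmoid and sorts passages by the result. The function names `sigmoid` and `rerank` and the sample scores are hypothetical; the scores stand in for model outputs and were not produced by this model.

```python
import math


def sigmoid(x):
    """Map a raw relevance score (logit) to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def rerank(passages, raw_scores):
    """Pair each passage with its normalized score, most relevant first.

    raw_scores are assumed to come from a reranker scoring (query, passage)
    pairs; here they are supplied by the caller for illustration only.
    """
    scored = [(p, sigmoid(s)) for p, s in zip(passages, raw_scores)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical raw scores for three candidate passages.
    for passage, score in rerank(["a", "b", "c"], [-2.0, 3.0, 0.0]):
        print(f"{passage}: {score:.3f}")
```

Sorting on the normalized score gives the same order as sorting on the raw score; the sigmoid only makes the values easier to compare against a fixed threshold.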

Text Classification

Depth Anything V2 Base

Depth-Anything-V2-Base is an ONNX-format depth estimation model adapted for Transformers.js, designed for image depth estimation on the web.

Lakshyakh93 Deberta Finetuned Pii Onnx

This is the ONNX-converted version of the lakshyakh93/deberta_finetuned_pii model, designed to identify Personally Identifiable Information (PII) in text.

Sequence Labeling

Transformers English

MusicGen Small is a Transformer-based music generation model capable of producing high-quality music clips from text descriptions.

Audio Generation

Object detection model based on YOLOv9, adapted for Transformers.js, capable of running in a browser

Object Detection

BGE-M3 is an embedding model that supports dense retrieval, lexical matching, and multi-vector interaction, converted to ONNX format for compatibility with frameworks like ONNX Runtime.

This is the ONNX quantized version of the BAAI/bge-m3 model, supporting three functionalities: dense retrieval, multi-vector retrieval, and sparse retrieval, covering over 100 languages.

GTE-Base is a general-purpose text embedding model capable of converting text into high-dimensional vector representations, suitable for text classification and similarity search tasks.

Xlm Roberta Base Language Detection ONNX

A language detection model based on XLM-RoBERTa that identifies the language of a given text.

Text Classification

Chinese Clip Vit Base Patch16

Chinese CLIP model based on ViT architecture, supporting multimodal understanding of images and text

Hubert Base Superb Ks

A voice command recognition model based on the HuBERT architecture, optimized for keyword spotting tasks

Audio Classification

Multilingual E5 Small Onnx

This is a multilingual sentence transformer model that maps text to a dense vector space, supporting semantic search and clustering tasks

Text Embedding English

Xlm Roberta Base Language Detection Onnx

This is the ONNX format conversion of the papluca/xlm-roberta-base-language-detection model, a text classification model that detects 20 languages.

Text Classification

Transformers Supports Multiple Languages

Deberta V3 Base Injection Onnx

This is the ONNX-converted version of the deepset/deberta-v3-base-injection model for detecting prompt injection attacks.

Text Classification

Transformers English

Nougat is a vision-based academic document understanding model capable of converting scientific PDF images into Markdown-formatted text.

Swin2sr Classical Sr X4 64

A classical image super-resolution model based on Swin2SR architecture, capable of upscaling image resolution by 4 times

Image Enhancement

E5 Large V2 Onnx

This is a sentence transformer model that maps sentences and paragraphs into a dense vector space, suitable for tasks such as clustering and semantic search.

Text Embedding English

Whisper Large V2 Onnx Int4 Inc

Whisper is a pre-trained automatic speech recognition (ASR) and speech translation model, trained on 680,000 hours of labeled data, demonstrating strong generalization capabilities. This repository contains the INT4 weight-only quantized version of the Whisper large v2 model in ONNX format.
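
To make "INT4 weight-only quantization" concrete, here is a minimal sketch of a generic symmetric round-to-nearest scheme with one scale per group of weights. It is an assumption-laden illustration, not the procedure used to produce the weights in this repository; the function names and the `group_size` value are hypothetical.

```python
def quantize_int4(weights, group_size=4):
    """Symmetric per-group INT4 quantization; returns (codes, scales).

    Each group of `group_size` weights shares one floating-point scale,
    and every weight is stored as an integer code in [-8, 7].
    """
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Divide by 7 so the largest magnitude maps onto the code +/-7.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales


def dequantize_int4(codes, scales, group_size=4):
    """Rebuild approximate float weights from codes and per-group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


if __name__ == "__main__":
    weights = [0.7, -0.35, 0.1, 0.0, 1.4, -1.4, 0.2, 0.6]
    codes, scales = quantize_int4(weights)
    print(codes, scales)
    print(dequantize_int4(codes, scales))
```

Only the weights are quantized in a scheme like this; activations stay in floating point, which is what "weight-only" refers to. The reconstruction error per weight is bounded by half of that group's scale.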

Speech Recognition

Whisper Large Onnx Int4 Inc

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. This repository provides the Whisper large model in ONNX format with INT4 weight quantization, powered by Intel® Neural Compressor and Intel® Transformers Extension.

Speech Recognition

Bge Large En V1.5 Quant

Quantized (INT8) ONNX variant of BGE-large-en-v1.5 with inference acceleration via DeepSparse

Transformers English

Clip Vit Large Patch14

OpenAI's open-source CLIP model, based on Vision Transformer (ViT) architecture, supporting joint understanding of images and text.

Wav2vec2 Large Xlsr 53 English

Large-scale speech recognition model based on the wav2vec 2.0 architecture, supporting English speech-to-text conversion

Speech Recognition

Clip Vit Base Patch32

CLIP model developed by OpenAI, based on Vision Transformer architecture, supporting joint understanding of images and text

Clip Vit Base Patch16

OpenAI's open-source CLIP model, based on Vision Transformer architecture, supporting cross-modal understanding of images and text

Sbert All MiniLM L6 With Pooler

This is an ONNX-converted model based on sentence-transformers/all-MiniLM-L6-v2, capable of mapping sentences and paragraphs into a 384-dimensional dense vector space, suitable for tasks like clustering or semantic search.
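
The sketch below shows how dense sentence vectors are commonly compared for semantic search, using cosine similarity. It does not load the model: the toy 3-dimensional vectors stand in for the 384-dimensional embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as having no similarity.
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus_vecs):
    """Return corpus indices ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(corpus_vecs)]
    return [i for _, i in sorted(scored, key=lambda s: s[0], reverse=True)]


if __name__ == "__main__":
    # Toy vectors standing in for sentence embeddings.
    query = [1.0, 0.0, 0.0]
    corpus = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
    print(rank_by_similarity(query, corpus))
```

In practice the vectors would come from running the model on each sentence; the ranking step is the same regardless of dimensionality.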

Transformers English

Vit Base Patch16 224

Image classification model based on Transformer architecture, pre-trained and fine-tuned on ImageNet-21k and ImageNet-1k datasets

Image Classification


© 2025 AIbase